A Multi-layered Summarization System for Multi-media Archives by Understanding and Structuring of Chinese Spoken Documents
Abstract
Multi-media archives are difficult to display on a screen and difficult to retrieve and browse. It is therefore important to develop technologies that summarize entire archives in the network content to help the user in browsing and retrieval. In a recent paper [1] we proposed a complete set of multi-layered technologies to handle at least some of the above issues: (1) Automatic Generation of Titles and Summaries for each of the spoken documents, so that the spoken documents become much easier to browse; (2) Global Semantic Structuring of the entire spoken document archive, offering the user a global picture of the semantic structure of the archive; and (3) Query-based Local Semantic Structuring for the subset of the spoken documents retrieved by the user's query, giving the user the detailed semantic structure of the spoken documents relevant to the query entered. Probabilistic Latent Semantic Analysis (PLSA) is found to be helpful. This paper presents an initial prototype system for Chinese archives with the functions mentioned above, in which a broadcast news archive in Mandarin Chinese is taken as the example archive.
Similar resources
Multi-layered Summarization of Spo Information Extraction and S
Spoken documents are difficult to display on a screen and difficult to retrieve and browse. It is therefore important to develop technologies to summarize the entire archives of the huge quantities of spoken documents in the network content to help the user in browsing and retrieval. In this paper we propose a complete set of multi-layered technologies to handle at least some ...
Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications
The most attractive form of future network content will be multi-media including speech information, and such speech information usually carries the core concepts for the content. As a result, the spoken documents associated with the multi-media content very possibly can serve as the key for retrieval and browsing. This paper presents a new approach of hierarchical topic organization and visual...
Structuring Multimedia Archives with Static Documents
This article will propose to consider static documents as structured and thematic vectors towards multimedia archives and as a tool for structuring events such as meetings or conferences recordings. A method for bridging the gap between static documents and multimedia data, such as audio and video, will be presented. First, a brief state-of-the-art of existing meeting/conference/class room proj...
Thermal Behavior of a New Type of Multi-Layered Porous Air Heater
Based on an effective energy conversion method between gas enthalpy and thermal radiation, a multi-layered type of porous air heater has been proposed. In the five-layered structure analyzed in this work, there are five porous layers separated by four quartz glass windows. The main layer operates as a porous radiant burner that produces a large amount of thermal radiative ene...
Multi-layered graph-based multi-document summarization model
Multi-document summarization is a process of automatic generation of a compressed version of the given collection of documents. Recently, graph-based models and ranking algorithms have been actively investigated by the extractive document summarization community. While most work to date focuses on homogeneous connectedness of sentences and heterogeneous connectedness of documents and sentence...
Publication date: 2006